Finding the Best Page using Synonyms
Abstract
Rating a page as the best one based only on the PageRank algorithm of Brin and Page would be insufficient, because that method relies on link information alone. With the application of Soft Computing to Data Mining and Knowledge Discovery, machines became more effective, and additional features of a page were considered: its indexing, the terms used, capitalization, anchor text, hit information, etc. Framing the task as a classification problem helped to bring this about to a great extent. The complexity of dealing with the large number of web pages on the net led researchers to consider solutions that sample pages randomly and then analyse the features of the sampled pages. Soft Computing techniques were used to analyse these features, including Genetic Algorithms, Neural Networks, Fuzzy Logic and Rough Sets. User profiles were created from the retrieved pages. Good and bad pages were categorised on the basis of the terms they contained, and these profiles were preserved for further reference. Pages were compared with each other for similarity using the Jaccard score and the Best-First search algorithm, with software agents developed for the purpose. Adaptive methods, which are close in concept to Genetic Algorithm applications, were also used. The frequency with which a user visited web pages was also considered as a parameter of interest. Techniques to generate page features using co-occurrence analysis were developed, and web pages were classified using machine learning. A good
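The Jaccard score mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each page is reduced to a set of terms and that the score is the size of the intersection divided by the size of the union. The function names, the whitespace tokenizer, and the sample page texts are hypothetical and are not taken from the paper.

```python
def jaccard_score(terms_a, terms_b):
    """Jaccard similarity |A & B| / |A | B| of two term collections."""
    a, b = set(terms_a), set(terms_b)
    if not a and not b:
        # Assumption: two pages with no terms are treated as dissimilar.
        return 0.0
    return len(a & b) / len(a | b)


def page_terms(text):
    """Naive tokenizer (an assumption): lowercase and split on whitespace."""
    return text.lower().split()


# Hypothetical page contents, for illustration only.
page_1 = page_terms("cheap car rental deals")
page_2 = page_terms("cheap automobile rental offers")

# Two shared terms ("cheap", "rental") out of six distinct terms.
score = jaccard_score(page_1, page_2)
```

Under plain term matching, "car" and "automobile" count as different terms, so they do not contribute to the overlap in this sketch.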
Similar resources
Mining and Ranking Biomedical Synonym Candidates from Wikipedia
Biomedical synonyms are important resources for Natural Language Processing in the biomedical domain. Existing synonym resources (e.g., the UMLS) are not complete. Manual efforts for expanding and enriching these resources are prohibitively expensive. We therefore develop and evaluate approaches for automated synonym extraction from Wikipedia. Using the inter-wiki links, we extracted the candidate ...
UBA: Using Automatic Translation and Wikipedia for Cross-Lingual Lexical Substitution
This paper presents the participation of the University of Bari (UBA) at the SemEval2010 Cross-Lingual Lexical Substitution Task. The goal of the task is to substitute a word in a language Ls, which occurs in a particular context, by providing the best synonyms in a different language Lt which fit in that context. This task has a strict relation with the task of automatic machine translation, b...
SynFinder: A System for Domain-Based Detection of Synonyms Using WordNet and the Web of Data
The detection of synonyms is a challenge that has attracted many contributions for the possible applications in many areas, including Semantic Web and Information Retrieval. An open challenge is to identify synonyms of a term that are appropriate for a specific domain, not just all the synonyms. Moreover, the execution time is critical when handling big data. Therefore, it is needed an algorith...
Using a Bilingual Resource to Add Synonyms to a Wordnet: FinnWordNet and Wikipedia as an Example
This paper presents a simple method for finding new synonym candidates for a bilingual wordnet by using another bilingual resource. Our goal is to add new synonyms to the existing synsets of the Finnish WordNet, which has direct word sense translation correspondences to the Princeton WordNet. For this task, we use Wikipedia and its links between the articles of the same topic in Finnish and Eng...